Repurposing Germline Exomes of the Cancer Genome Atlas Demands a Cautious Approach and Sample-Specific Variant Filtering
Abstract
When seeking to reproduce results derived from whole-exome or genome sequencing data that could advance precision medicine, the time and expense required to produce a patient cohort make data repurposing an attractive option. The first step in repurposing is setting some quality baseline for the data so that conclusions are not spurious. This is difficult because there can be variations in quality from center to center, clinic to clinic and even patient to patient. Here, we assessed the quality of the whole-exome germline mutations of TCGA cancer patients using patterns of nucleotide substitution and negative selection against impactful mutations. We estimated the fraction of false positive variant calls for each exome with respect to two gold standard germline exomes, and found large variability in the quality of SNV calls between samples, cancer subtypes, and institutions. We then demonstrated how variant features, such as the average base quality for reads supporting an allele, can be used to identify sample-specific filtering parameters to optimize the removal of false positive calls. We concluded that while these germlines have many potential applications to precision medicine, users should assess the quality of the available exome data prior to use and perform additional filtering steps.
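The idea of choosing a filtering parameter per sample can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the transition/transversion (Ti/Tv) ratio as a stand-in for the "patterns of nucleotide substitution" mentioned above, and it assumes the target ratio comes from a trusted reference exome. The names `Call`, `titv_ratio`, and `pick_threshold` are hypothetical and introduced only for illustration.

```python
from typing import NamedTuple, Optional, Sequence

class Call(NamedTuple):
    """One SNV call from a single sample (hypothetical record layout)."""
    ref: str
    alt: str
    mean_base_quality: float  # average base quality of reads supporting alt

# Purine<->purine and pyrimidine<->pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def titv_ratio(calls: Sequence[Call]) -> float:
    """Ratio of transitions to transversions among the given calls."""
    ti = sum(1 for c in calls if (c.ref, c.alt) in TRANSITIONS)
    tv = len(calls) - ti
    return ti / tv if tv else float("inf")

def pick_threshold(
    calls: Sequence[Call],
    candidates: Sequence[float],
    target_titv: float,
) -> Optional[float]:
    """Return the candidate base-quality cutoff whose surviving calls have a
    Ti/Tv ratio closest to target_titv (assumed to come from a reference
    exome). Returns None if every cutoff removes all calls."""
    best_cutoff, best_gap = None, None
    for cutoff in candidates:
        kept = [c for c in calls if c.mean_base_quality >= cutoff]
        if not kept:
            continue
        gap = abs(titv_ratio(kept) - target_titv)
        if best_gap is None or gap < best_gap:
            best_cutoff, best_gap = cutoff, gap
    return best_cutoff

# Toy sample: low-quality calls are enriched for transversions.
sample = [
    Call("A", "G", 30.0), Call("C", "T", 30.0), Call("G", "A", 30.0),
    Call("A", "C", 30.0),
    Call("A", "T", 10.0), Call("G", "C", 10.0), Call("C", "A", 10.0),
]
chosen = pick_threshold(sample, candidates=[0.0, 20.0], target_titv=3.0)
```

Running the selection separately for each exome yields a cutoff per sample rather than one global value, which is the point of sample-specific filtering; a real analysis would use many more calls and whatever quality signal the data support.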
Related articles
CanVar: A resource for sharing germline variation in cancer patients
The advent of high-throughput sequencing has accelerated our ability to discover genes predisposing to disease and is transforming clinical genomic sequencing. In both contexts knowledge of the spectrum and frequency of genetic variation in the general population and in disease cohorts is vital to the interpretation of sequencing data. While population level data is becoming increasingly availa...
Association of a New Germline Variant in the MUTYH DNA Glycosylase Gene with Colorectal Adenoma Transformation into Malignancy
Background: MUTYH DNA glycosylase germline mutations are linked to the recessive inheritance of multiple adenoma. Studies have revealed that germline mutations in this gene are ethnicity related. This study aimed to identify the germline mutations in MUTYH gene and determine their prevalence among Jordanian patients with colorectal adenoma. Methods: In this study, 150 colorectal adenoma patient...
Population analysis of microsatellite genotypes reveals a signature associated with ovarian cancer
Ovarian cancer (OV) ranks fifth in cancer deaths among women, yet there remain few informative biomarkers for this disease. Microsatellites are repetitive genomic regions which we hypothesize could be a source of novel biomarkers for OV and have traditionally been under-appreciated relative to Single Nucleotide Polymorphisms (SNPs). In this study, we explore microsatellite variation as a potent...
Immune DNA signature of T-cell infiltration in breast tumor exomes
Tumor infiltrating lymphocytes (TILs) have been associated with favorable prognosis in multiple tumor types. The Cancer Genome Atlas (TCGA) represents the largest collection of cancer molecular data, but lacks detailed information about the immune environment. Here, we show that exome reads mapping to the complementarity-determining-region 3 (CDR3) of mature T-cell receptor beta (TCRB) can be u...
Preferential Allele Expression Analysis Identifies Shared Germline and Somatic Driver Genes in Advanced Ovarian Cancer
Identifying genes where a variant allele is preferentially expressed in tumors could lead to a better understanding of cancer biology and optimization of targeted therapy. However, tumor sample heterogeneity complicates standard approaches for detecting preferential allele expression. We therefore developed a novel approach combining genome and transcriptome sequencing data from the same sample...
Journal:
- Pacific Symposium on Biocomputing
Volume 21
Publication date: 2016